Information Retrieval Based on Word Senses
Abstract
This paper proposes an algorithm for word sense disambiguation based on a vector representation of word similarity derived from lexical co-occurrence. It differs from standard approaches by allowing for distinctions as fine-grained as the information at hand warrants, rather than supposing a fixed number of senses per word, and by allowing more than one sense to be assigned to a given word occurrence. The algorithm is applied to the standard vector-space information retrieval model, and an evaluation is performed over the Category B TREC-1 corpus (WSJ subcollection). Results show that this sense disambiguation algorithm improves performance by between 7% and 14% on average.
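The idea of assigning more than one sense to a single word occurrence can be illustrated with a small sketch. This is not the paper's algorithm: the sense centroids, the window size, the `ratio` threshold rule, and all function names below are hypothetical choices made only for illustration. Here the sense vectors are given by hand, whereas the abstract describes senses derived from co-occurrence data.

```python
import math
from collections import Counter


def context_vector(tokens, index, window=2):
    """Count the words within `window` positions of tokens[index] (assumed context definition)."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    return Counter(t for i, t in enumerate(tokens[lo:hi], start=lo) if i != index)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def assign_senses(context, sense_centroids, ratio=0.8):
    """Return every sense whose similarity is at least `ratio` times the best similarity.

    The threshold rule is an assumption; it simply lets one occurrence carry several senses.
    """
    scores = {sense: cosine(context, centroid) for sense, centroid in sense_centroids.items()}
    best = max(scores.values())
    if best == 0.0:
        return []  # the context shares no words with any sense vector
    return sorted(sense for sense, score in scores.items() if score >= ratio * best)


# Hypothetical, hand-written sense centroids for the word "bank".
centroids = {
    "bank/finance": Counter({"money": 3, "loan": 2, "interest": 2}),
    "bank/river": Counter({"river": 3, "water": 2, "shore": 1}),
}

tokens = "the loan from the bank carried high interest".split()
ctx = context_vector(tokens, tokens.index("bank"), window=3)
senses = assign_senses(ctx, centroids)
```

With a context that mentions both money and a river, `assign_senses` returns both senses rather than forcing a single choice, which is the behaviour the abstract contrasts with standard one-sense-per-occurrence approaches.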
Similar resources
Word Sense Disambiguation Improves Information Retrieval
Previous research has conflicting conclusions on whether word sense disambiguation (WSD) systems can improve information retrieval (IR) performance. In this paper, we propose a method to estimate sense distributions for short queries. Together with the senses predicted for words in documents, we propose a novel approach to incorporate word senses into the language modeling approach to IR and al...
LSM: Language Sense Model for Information Retrieval
A lot of work has been done on drawing word senses into retrieval to deal with the word sense ambiguity problem, but most of them achieved negative results. In this paper, we first implement a WSD system for nouns and verbs, then the language sense model (LSM) for information retrieval is proposed. The LSM combines the terms and senses of a document seamlessly through an EM algorithm. Retrieval...
Topical Clustering of MRD Senses Based on Information Retrieval Techniques
This paper describes a heuristic approach capable of automatically clustering senses in a machine-readable dictionary (MRD). Including these clusters in the MRD-based lexical database offers several benefits for word sense disambiguation (WSD). First, the clusters can be used as a coarser sense division, so unnecessarily fine sense distinctions can be avoided. The clustered entries in th...
Multiple Word senses and Information Retrieval: An application using thesaurally derived Lexical Chains
The primary objective of this work is to improve Internet-based information retrieval. Currently, Internet search engines retrieve a heterogeneous collection of documents of varied quality. While many are "relevant" to the search terms used, many others only coincidentally contain a matched word. They do not, in other words, have meaningful content. An enabling objective is to develop a "weakly" int...
Lexical Disambiguation Using Constraint Handling In Prolog (CHIP)
Automatic sense disambiguation has been recognised by the research community as very important for a number of natural language processing applications, such as information retrieval, machine translation, and speech recognition. This paper describes experiments with an algorithm for lexical sense disambiguation, that is, predicting which of many possible senses of a word is intended in a given sente...
Publication date: 1995